---
title: Bert Visualize
keywords: fastai
sidebar: home_sidebar
summary: "Visualize the predictions of a masked language modeling transformer model"
---
{% raw %}
```python
# !pip install transformers
from transformers import AutoModelForMaskedLM, AutoTokenizer

model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased", use_fast=True)
```

```
Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForMaskedLM: ['cls.seq_relationship.weight', 'cls.seq_relationship.bias']
- This IS expected if you are initializing BertForMaskedLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPretraining model).
- This IS NOT expected if you are initializing BertForMaskedLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForMaskedLM were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['cls.predictions.decoder.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
```

A piece of sample text, with `[MASK]` tokens for the model to fill in:

text = """I must not [MASK].
Fear is the mind-killer.
Fear is the little [MASK] that brings total obliteration.
I will face my fear.
I will permit it to pass over me and through me.
And when it has gone past I will turn the inner [MASK] to see its path.
Where the fear has gone there will be nothing.
Only I will remain."""
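The tokenizer encodes `[MASK]` as a dedicated vocabulary id (103 in bert-base-uncased), so the masked positions can be found by a simple comparison after encoding. A minimal sketch, using a hypothetical helper (not part of this library) and illustrative token ids:

```python
def mask_positions(input_ids, mask_token_id=103):
    "Return the indices of [MASK] tokens in a list of token ids."
    return [i for i, tok in enumerate(input_ids) if tok == mask_token_id]

# Illustrative encoding with a single [MASK] at index 4
ids = [101, 1045, 2442, 2025, 103, 1012, 102]
print(mask_positions(ids))  # → [4]
```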

## class MLMVisualizer[source]

> `MLMVisualizer(model, tokenizer)`
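The constructor takes a model/tokenizer pair, and the `from_pretrained` call below suggests the class builds both from a checkpoint name in one step. A minimal sketch of that behavior — the library's actual implementation is not shown here, and the injectable `model_cls`/`tokenizer_cls` parameters exist only to keep the example self-contained:

```python
class MLMVisualizerSketch:
    "Bundle a masked-LM model with its tokenizer (assumed internals)."
    def __init__(self, model, tokenizer):
        self.model, self.tokenizer = model, tokenizer

    @classmethod
    def from_pretrained(cls, name, model_cls=None, tokenizer_cls=None):
        # Default to the Hugging Face Auto classes, as in the setup cell above
        if model_cls is None or tokenizer_cls is None:
            from transformers import AutoModelForMaskedLM, AutoTokenizer
            model_cls = model_cls or AutoModelForMaskedLM
            tokenizer_cls = tokenizer_cls or AutoTokenizer
        return cls(model_cls.from_pretrained(name),
                   tokenizer_cls.from_pretrained(name, use_fast=True))
```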

```python
vis = MLMVisualizer.from_pretrained("bert-base-uncased")
```

#### li[source]

> `li(x)`

#### infer_logits[source]

> `infer_logits(vis, y_pred, mask)`
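Judging by its name and arguments, `infer_logits` plausibly converts the model's raw output at the `[MASK]` positions into probability distributions over the vocabulary. A hedged NumPy sketch of that step (the function name, shapes, and mask convention are assumptions, not the library's code):

```python
import numpy as np

def infer_logits_sketch(y_pred, mask):
    """y_pred: (seq_len, vocab_size) logits; mask: boolean (seq_len,)
    marking [MASK] positions. Returns (n_masks, vocab_size) probabilities."""
    masked = y_pred[mask]                                  # keep only [MASK] rows
    shifted = masked - masked.max(axis=-1, keepdims=True)  # numerical stability
    probs = np.exp(shifted)                                # softmax over vocab
    return probs / probs.sum(axis=-1, keepdims=True)
```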

#### predict_text[source]

> `predict_text(vis, text)`

#### visualize[source]

> `visualize(vis, text)`

#### visualize_result[source]

> `visualize_result(vis, result: Config)`
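`visualize_result` renders the predictions; the actual output is rich HTML, but the core idea can be sketched in plain text: one line per `[MASK]` listing its top candidate tokens with their probabilities. All names below are hypothetical, not the library's API:

```python
def render_result_sketch(candidates):
    """candidates: one list per [MASK], of (token, probability) pairs
    sorted by descending probability."""
    lines = []
    for i, preds in enumerate(candidates):
        best = ", ".join(f"{tok} ({p:.0%})" for tok, p in preds)
        lines.append(f"[MASK] #{i + 1}: {best}")
    return "\n".join(lines)

print(render_result_sketch([[("fear", 0.62), ("fail", 0.11)]]))
# → [MASK] #1: fear (62%), fail (11%)
```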

```python
%%time
result = predict_text(vis, text)
```

```
CPU times: user 460 ms, sys: 44 ms, total: 504 ms
Wall time: 353 ms
```
```python
%%time
vis.visualize(text)
```

*Masked language modeling visualization* (rendered HTML output)

```
CPU times: user 1.34 s, sys: 86.1 ms, total: 1.42 s
Wall time: 1.13 s
```
{% endraw %}